Beyond Frequencies: Graph Pattern Mining in Multi-weighted Graphs
Abstract
Graph pattern mining aims at identifying structures that appear frequently in large graphs, under the assumption that frequency signifies importance. Several measures of frequency have been proposed that respect the apriori property, essential for an efficient search of the patterns. This property states that the number of appearances of a pattern in a graph cannot be larger than the frequency of any of its sub-patterns. In real life, there are many graphs with weights on nodes and/or edges. For these graphs, it is fair that the importance (score) of a pattern is determined not only by the number of its appearances, but also by the weights on the nodes/edges of those appearances. Scoring functions based on the weights do not generally satisfy the apriori property, thus forcing many approaches to employ other, less efficient, pruning strategies to speed up the computation. The problem becomes even more challenging in the case of multiple weighting functions that assign different weights to the same nodes/edges. In this work, we provide efficient and effective techniques for mining patterns in multi-weight graphs. We devise both an exact and an approximate solution. The first is characterized by intelligent storage and computation of the pattern scores, while the second is based on the aggregation of similar weighting functions to allow scalability and avoid redundant computations. Both methods adopt a scoring function that respects the apriori property, and thus they can rely on effective pruning strategies. Extensive experiments under different parameter settings prove that the presence of edge weights and the choice of scoring function affect the patterns mined, and hence the quality of the results returned to the user. Finally, experiments on datasets of different sizes and increasing numbers of weighting functions show that, even when the performance of the exact algorithm degrades, the approximate algorithm performs well and with quite good quality.
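As a rough illustration of why weight-based scores can break the apriori property, the following Python sketch compares two measures on a toy graph. Everything in it is an assumption made for illustration: the graph, the labels, the weights, and the two measures. Minimum image-based (MNI) support is used here as one well-known frequency measure that respects the apriori property, and a plain average edge weight stands in for a weight-based score; the abstract does not say which measures the authors actually use.

```python
from itertools import permutations

# Toy undirected graph (hypothetical data): node -> label, edge -> weight.
labels = {1: "A", 2: "B", 3: "C", 4: "A", 5: "B"}
weights = {(1, 2): 0.1, (2, 3): 0.9, (4, 5): 0.2}


def edge_weight(u, v):
    """Return the weight of edge {u, v}, or None if the edge is absent."""
    return weights.get((u, v), weights.get((v, u)))


def embeddings(pattern_labels, pattern_edges):
    """Brute-force every label-preserving embedding of a small pattern."""
    found = []
    for nodes in permutations(labels, len(pattern_labels)):
        if any(labels[n] != pattern_labels[i] for i, n in enumerate(nodes)):
            continue
        if all(edge_weight(nodes[i], nodes[j]) is not None
               for i, j in pattern_edges):
            found.append(nodes)
    return found


def mni_support(pattern_labels, pattern_edges):
    """MNI support: the smallest number of distinct graph nodes
    that any single pattern node is mapped to across all embeddings."""
    embs = embeddings(pattern_labels, pattern_edges)
    if not embs:
        return 0
    return min(len({e[i] for e in embs}) for i in range(len(pattern_labels)))


def avg_weight_score(pattern_labels, pattern_edges):
    """Illustrative weight-based score: mean edge weight over all embeddings."""
    embs = embeddings(pattern_labels, pattern_edges)
    if not embs:
        return 0.0
    total = sum(edge_weight(e[i], e[j]) for e in embs for i, j in pattern_edges)
    return total / (len(embs) * len(pattern_edges))


# Sub-pattern: a single A-B edge. Super-pattern: the path A-B-C.
sub = (["A", "B"], [(0, 1)])
sup = (["A", "B", "C"], [(0, 1), (1, 2)])

# The frequency measure never grows when the pattern is extended ...
print(mni_support(*sub), mni_support(*sup))
# ... but the average-weight score does, so it cannot be used for apriori pruning.
print(avg_weight_score(*sub), avg_weight_score(*sup))
```

In this toy graph the larger pattern has the lower MNI support but the higher average weight, because its only appearance contains the heavy B-C edge. A miner that pruned the A-B edge for its low average weight would therefore miss the higher-scoring A-B-C path.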
Similar resources
WIGM: Discovery of Subgraph Patterns in a Large Weighted Graph
Many research areas have begun representing massive data sets as very large graphs. Thus, graph mining has been an active research area in recent years. Most of the graph mining research focuses on mining unweighted graphs. However, weighted graphs are actually more common. The weight on an edge may represent the likelihood or logarithmic transformation of likelihood of the existence of the edg...
Frequent subgraph mining algorithms on weighted graphs
This thesis describes research work undertaken in the field of graph-based knowledge discovery (or graph mining). The objective of the research is to investigate the benefits that the concept of weighted frequent subgraph mining can offer in the context of the graph model based classification. Weighted subgraphs are graphs where some of the vertexes/edges are considered to be more significant t...
On Symmetry of Some Nano Structures
It is necessary to generate the automorphism group of a chemical graph in computer-aided structure elucidation. An Euclidean graph associated with a molecule is defined by a weighted graph with adjacency matrix M = [dij], where for i≠j, dij is the Euclidean distance between the nuclei i and j. In this matrix dii can be taken as zero if all the nuclei are equivalent. Otherwise, one may introduce...
Mining Edge-Weighted Call Graphs to Localise Software Bugs
An important problem in software engineering is the automated discovery of noncrashing occasional bugs. In this work we address this problem and show that mining of weighted call graphs of program executions is a promising technique. We mine weighted graphs with a combination of structural and numerical techniques. More specifically, we propose a novel reduction technique for call graphs which ...